feat(book): add heRef property for stable Hebrew reference identification #40

Merged
kdroidFilter merged 1 commit into dev from feat/add-book-heref
Jan 24, 2026

@kdroidFilter
Owner

@kdroidFilter kdroidFilter commented Jan 24, 2026

Summary

  • Add heRef column to the book table for stable Hebrew reference identification
  • Enable lookup of books by their Hebrew reference across database regenerations
  • Similar pattern to existing line.heRef for lines

Changes

  • Database schema: Added heRef TEXT column and idx_book_heref index
  • Queries: Updated INSERT queries, added selectByHeRef query
  • Model: Added heRef: String? property with KDoc
  • Repository: Added getBookByHeRef() method
  • Generators: Set heRef from title (Otzaria) or heTitle (Sefaria)
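
The schema and lookup changes listed above can be sketched roughly as follows. This is a minimal illustration using Python's built-in `sqlite3` module, not the project's actual code: the `book` table, the `heRef` column, and the `idx_book_heref` index come from this PR, while the `id` and `title` columns, the function names, and the sample value are assumptions made for the example.

```python
import sqlite3


def create_schema(conn):
    # Simplified book table; only heRef and idx_book_heref are named in the PR,
    # the id and title columns are assumed for illustration.
    conn.execute(
        "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL, heRef TEXT)"
    )
    # Index so that lookups by heRef do not scan the whole table.
    conn.execute("CREATE INDEX idx_book_heref ON book(heRef)")


def insert_book(conn, title, he_ref):
    # The INSERT now carries heRef alongside the other columns.
    cur = conn.execute(
        "INSERT INTO book (title, heRef) VALUES (?, ?)", (title, he_ref)
    )
    return cur.lastrowid


def get_book_by_he_ref(conn, he_ref):
    # Hypothetical counterpart of the selectByHeRef query / getBookByHeRef() method.
    # Returns an (id, title, heRef) tuple, or None when no book matches.
    return conn.execute(
        "SELECT id, title, heRef FROM book WHERE heRef = ? LIMIT 1", (he_ref,)
    ).fetchone()
```

Because `heRef` is nullable (`String?` in the model), a lookup for a reference that was never set simply returns no row.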

Test plan

  • Build SeforimLibrary: ./gradlew :SeforimLibrary:build
  • Run tests: ./gradlew :SeforimLibrary:test
  • Verify database generation includes heRef values

@KleiKodesh

KleiKodesh commented Jan 24, 2026

The heRef references in Bavli (or any non-Tanach book, with the possible exception of שלחן ערוך) are interesting.
מנחות נג:, ב מנחות נג:, א
Are you referencing line numbers with Hebrew letters?
Will the user actually see these references? If so, this might not be intuitive; it's not clear what the reference means. The Talmud (or any non-Tanach book) doesn't naturally have built-in references to line numbers.

Perhaps I completely misunderstood the concept?
I'm assuming this is mainly for displaying search results?
On deeper analysis, it seems this is for comparison with Sefaria. But in that case, are you planning to enable direct DB updates without having the user download the whole DB? Otherwise, isn't this bloat on the user side?
Well, I am sure you know what you are doing; I'm just a guy from the outside rambling...
I hope I am not disturbing...

@kdroidFilter kdroidFilter changed the base branch from master to dev January 24, 2026 19:56
@kdroidFilter
Owner Author

kdroidFilter commented Jan 24, 2026

@KleiKodesh The IDs of the books and lines are not fixed across database versions, which poses a problem for building a history or an annotation system. heRef will be used for that purpose, and perhaps for a deep-linking system.

@KleiKodesh

Can you clarify, please: what do you mean by "the IDs of the books or lines are not fixed"?
I assume this is on the Sefaria side, not in your DB?

…tion

- Add heRef column to book table with index for efficient lookups
- Update Book model with heRef property and KDoc documentation
- Add selectByHeRef query and getBookByHeRef repository method
- Update insert queries to include heRef parameter
- Set heRef from title in Otzaria generator
- Set heRef from heTitle in Sefaria generator

This allows stable identification of books across database regenerations,
similar to how line.heRef works for lines.
@kdroidFilter
Owner Author

@KleiKodesh The generator assigns an ID to each book in the order in which they appear in Sefaria, then Otzaria. If Sefaria decides to add a book at the beginning, this would shift the IDs of all subsequent books by one, and therefore also the IDs of the lines. As a result, these IDs are not stable and cannot be used for history, an annotation system, or even a deeplink system, which are supposed to remain consistent even if the database is updated.
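
The ID shift described above can be illustrated with a small sketch. It is a toy model in Python, not the generator's code: the sequential numbering from 1 and the placeholder names ("Book A", "New Book") are assumptions made for the example.

```python
def assign_ids(he_refs):
    """Number books 1..n in generation order (assumed to mimic the generator)."""
    return {he_ref: book_id for book_id, he_ref in enumerate(he_refs, start=1)}


def he_ref_for_id(ids, book_id):
    """Reverse lookup: which heRef does a numeric ID point to in this version?"""
    for he_ref, candidate in ids.items():
        if candidate == book_id:
            return he_ref
    return None


# Two database generations; the second has one book added at the beginning.
old_version = assign_ids(["Book A", "Book B", "Book C"])
new_version = assign_ids(["New Book", "Book A", "Book B", "Book C"])

saved_id = old_version["Book B"]  # bookmark stored by numeric ID (2)
saved_he_ref = "Book B"           # bookmark stored by heRef

print(he_ref_for_id(new_version, saved_id))  # prints "Book A": the ID now points at another book
print(saved_he_ref in new_version)           # prints "True": the heRef still resolves
```

The same reasoning applies to line IDs, since they are numbered after the books that contain them.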

@kdroidFilter kdroidFilter marked this pull request as ready for review January 24, 2026 20:13
@kdroidFilter kdroidFilter merged commit 68d300b into dev Jan 24, 2026
@KleiKodesh

KleiKodesh commented Jan 24, 2026

Apologies, I understand the concept, yet I am still not clear on the final result.
Does this result in an ID shift in the DB too?
Does this mean a regeneration of the search engine too?
Relevance: I am busy implementing a lightweight, dumb search engine for the Word add-in; it maps the book IDs and line indexes or line IDs.
It doesn't make a difference to me which I use, though line IDs seem cleaner to me.
If one of them is due to change with versioning, then I would rather know.
If my logic is flawed, I hope you can point this out to me.

@kdroidFilter
Owner Author

@KleiKodesh Yes, the book and line IDs will inevitably change between versions, but as far as a search engine index is concerned, I think it is simpler to rebuild it entirely for each version rather than using this pattern, because books may be removed or added and you would then have to handle all of that.
